OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Classifier Adaptation with Non-representative Training Data

Internal identifier: 001927 (Main/Exploration); previous: 001926; next: 001928

Authors: Sriharsha Veeramachaneni [United States]; George Nagy (computer scientist) [United States]

RBID : ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845

Abstract

We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-print to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating mean and imposing style constraints reduce the error-rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.
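
As an illustration of the idea of adapting only the class means of a Gaussian quadratic classifier to an unlabeled test set, here is a minimal sketch. It is not the authors' implementation. The abstract does not specify the estimation procedure, so the decision-directed re-estimation loop, the function names `quadratic_scores` and `adapt_means`, the iteration count, and the synthetic two-class data (a stand-in for the NIST digit sets) are all assumptions made for this example.

```python
import numpy as np


def quadratic_scores(X, means, covs, priors):
    """Per-class Gaussian log-scores (log-likelihood + log-prior, constants dropped)."""
    scores = np.empty((X.shape[0], len(means)))
    for k, (mu, cov, prior) in enumerate(zip(means, covs, priors)):
        diff = X - mu
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every sample to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores[:, k] = -0.5 * (maha + logdet) + np.log(prior)
    return scores


def adapt_means(X_test, means, covs, priors, n_iter=10):
    """Hypothetical decision-directed adaptation (an assumption of this sketch).

    Classify the unlabeled test set, replace each class mean by the mean of
    the samples assigned to that class, and repeat. Covariances and priors
    stay fixed at their training values.
    """
    means = np.array(means, dtype=float)  # copy, so the training means are untouched
    for _ in range(n_iter):
        labels = quadratic_scores(X_test, means, covs, priors).argmax(axis=1)
        for k in range(len(means)):
            members = X_test[labels == k]
            if len(members) > 0:  # keep the old mean if no sample was assigned
                means[k] = members.mean(axis=0)
    return means


# Synthetic stand-in data: the test "style" is the training distribution
# shifted by a constant offset (illustrative only, not the NIST data).
rng = np.random.default_rng(0)
train_means = np.array([[0.0, 0.0], [4.0, 0.0]])
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]
shift = np.array([1.0, 0.0])
y_test = np.repeat([0, 1], 200)
X_test = train_means[y_test] + shift + rng.standard_normal((400, 2))

adapted = adapt_means(X_test, train_means, covs, priors)
pred_before = quadratic_scores(X_test, train_means, covs, priors).argmax(axis=1)
pred_after = quadratic_scores(X_test, adapted, covs, priors).argmax(axis=1)
acc_before = float((pred_before == y_test).mean())
acc_after = float((pred_after == y_test).mean())
print("accuracy before adaptation:", acc_before)
print("accuracy after adaptation: ", acc_after)
```

The test labels `y_test` are used only to measure accuracy, never during adaptation, which is what makes the procedure unsupervised. With very short fields (two or three digits) the per-class averages in `adapt_means` would rest on almost no samples, which is consistent with the abstract's remark that adaptation needs a sufficiently large test set.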

DOI: 10.1007/3-540-45869-7_17


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author>
<name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</author>
<author>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation>
<country>États-Unis</country>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_17</idno>
<idno type="url">https://api.istex.fr/document/BA6AC24A377F2F9A6379DAC3467543B5C8B7A845/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B14</idno>
<idno type="wicri:Area/Istex/Curation">000B01</idno>
<idno type="wicri:Area/Istex/Checkpoint">001041</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Veeramachaneni S:classifier:adaptation:with</idno>
<idno type="wicri:Area/Main/Merge">001A07</idno>
<idno type="wicri:Area/Main/Curation">001927</idno>
<idno type="wicri:Area/Main/Exploration">001927</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author>
<name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
<placeName>
<settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<idno type="DOI">10.1007/3-540-45869-7_17</idno>
<idno type="ChapterID">17</idno>
<idno type="ChapterID">Chap17</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-print to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating mean and imposing style constraints reduce the error-rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>État de New York</li>
</region>
<settlement>
<li>Troy (New York)</li>
</settlement>
<orgName>
<li>Institut polytechnique Rensselaer</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</region>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001927 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001927 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845
   |texte=   Classifier Adaptation with Non-representative Training Data
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024